Abstract


The final step in getting an Israeli M.D. is performing a year-long 
internship in one of the hospitals in Israel. Internships are decided 
upon by a lottery, which is known as “The Internship Lottery”. In 
2014 we redesigned the lottery, replacing it with a more efficient one. 
The new method is based on calculating a tentative lottery, in which 
each student has some probability of getting to each hospital. Then 
a computer program “trades” between the students, where trade is 
performed only if it is beneficial to both sides. This trade creates 
surplus, which translates to more students getting one of their top 
choices. The average student improved his place by 0.91 seats. The 
new method can improve the welfare of medical graduates, by giving 
them more probability to get to one of their top choices. It can be 
applied in internship markets in other countries as well. 

This thesis presents the market, the redesign process and the new 
mechanism which is now in use. There are two main lessons that we 
have learned from this market. The first is the “Do No Harm” principle,
which states that (almost) all participants should prefer the new
mechanism to the old one. The second is that new approaches need
to be used when dealing with two-body problems in object assignment.
We focus on the second lesson, and study two-body problems
in the context of the assignment problem. We show that decomposing 
stochastic assignment matrices to deterministic allocations is NP-hard 
in the presence of couples, and present a polynomial time algorithm 
with the optimal worst case guarantee. We also study the performance 
of our algorithm on real-world and on simulated data. 




1 Introduction 


Prior to receiving their medical degrees, Israeli medical graduates (and
foreign-trained doctors) must participate in a year-long internship program (not
to be confused with the residency period following their graduation). The
internship is carried out in one of 23 possible hospitals in the country, which
are allocated interns in proportion to the size of their patient population (with an
advantage to peripheral hospitals), with the smallest allocation being 4
interns. The internship is perceived as a tax that must be paid: there is
(almost) no correlation between the hospital in which the internship is
performed and the hospital in which residency is performed, interns work nights
and weekends for below minimum wage, and they receive no tuition
during the internship.¹

By and large, interns are not assigned to hospitals on the basis of merit,
since the government wants to spread the more talented interns everywhere.²
Since merit is not a criterion, it makes sense to use some form of lottery
to assign the interns. For many years, the lottery that was chosen was
Random Serial Dictatorship (henceforth RSD)³, with some house rules.

In this thesis, we describe the market and the different populations of 
students, discuss the old lottery, present our new design for the lottery, which 
was used first in 2014 (and is also in use this year) and present the results of 
the implementation. We focus this work on the lessons we learned, and on 
techniques that can be used in designing future markets. More specifically, 
this thesis deals with 


1. the “Do No Harm” principle, and the constraint that under some metric 
no student will be worse off in the new match, 

2. the trade-off between efficiency and truthfulness, and 

3. two-body problems in the context of the assignment problem. 

In addition, we describe the market, the participants, and share the available 
data. 

1 One justification for this tax is that medical studies are highly subsidized by the 
government, with a tuition cost of around 2500$ per year. In return, the government uses 
the interns as cheap labor in hospitals (all hospitals are operated by the government). 

2 There is an exception to this rule - five interns with a PhD get to choose where they 
want to be assigned. 

3 Often referred to also as “Random Priority”. See, for example, [1].
1.1 Background on the internship market


Each year, two different cohorts of students enter independent lotteries which 
determine where each student performs her internship. The number of interns 
allocated to each hospital is determined independently for each cohort. 

The first cohort is composed of students who are in their final year of 
their medical school in Israel. This cohort is the more interesting and diverse 
one, and contains four populations: 

1. a handful of students (usually with a PhD) who get to choose their 
internship. From a market design perspective, they can be treated as 
reducing the capacity. 

2. Couples who wish to be in the same hospital. Unlike the (American) 
National Residency Matching Program (NRMP), there is no notion of 
preferences over pairs of hospitals. This makes sense: in the NRMP 
one spouse may want to be a radiologist, and the other may want to 
be a psychiatrist (so they need to rank pairs of programs), whereas in 
the current match everyone wants to (well, doesn’t want to, but has to) 
be an intern. Around 10% of the interns are involved in a couple, and 
most couples are not married, but are just friends who want to share 
an apartment if they are stuck in some village where they do not know 
anybody. Unlike the NRMP, couples are guaranteed to be interns in 
the same place. 

3. Students who have kids. This population is characterized by not wanting
to relocate the family. Hence, they are usually willing to take any
hospital which is within driving distance of home, and are less sensitive
to the hospital’s quality.


4. The rest of the students. 


In the last decade, the lottery that was used was Random Serial Dictatorship,
with a few small modifications:

1. The five students who get to pick choose where they want to go.

2. A couple is considered as one student for the matter of choosing the
permutation. Alternatively, each couple gets two consecutive spots. If
the couple’s top choice among hospitals which have vacancies is A, but
A has just one free spot, they are not allowed to go there and will keep
going down their preferences.⁴


4 It is not clear what happens if all hospitals have just one vacancy when a couple is 
chosen. Maybe it just did not happen in recent years. The medical students believed that 
couples will always stay together, and this was one of their requirements. 
3. Every year there is a vote about what to do with students with kids
(sometimes an advantage is given only to students with two or more 
kids). The options are to treat them as regular students, to let them 
take part in the lottery but promise them a seat in their district (the 
country is divided into five districts for that matter), or to promise 
them a seat in their district, without participation in the lottery (so 
they lose the chance to get good hospitals). When the last option is 
being used, most parents do not declare themselves parents, as they 
claim that this is not really a benefit. 

The second cohort consists of graduates of foreign schools, who want to 
become Israeli doctors. This only applies to students who studied in certain 
countries and have no experience, and most of this cohort is composed of 
Israelis who were not accepted to medicine in Israel, and studied in Jordan 
or in Europe with the intention of going back to Israel after graduation. 

For the foreign cohort, the internship lottery is run by the Ministry of 
Health (MoH). There are no votes or appeals, and rules are simpler. For 
the Israeli cohort, the MoH just decides on the capacities, and delegates the 
rest of the lottery process to a committee of students, elected by the student 
body. It is common that the committee puts important decisions (such as 
the parental benefits) to vote by the entire student body. 

2 The redesign process 

We were first approached in 2010 by several medical students who asked for 
help in redesigning their market. They had two things which worried them: 

1. giving fair benefits to parents. They felt that parents should be treated 
differently, but it should not be at the expense of other students, and 
were not sure how to do this, and 

2. they wanted to improve the efficiency of the system. 

We took part in the internship committee of 2011, to understand their demands
better. They added a couple of technical requirements:

1. Do No Harm : The redesign process should not hurt any population of 
students. Everyone should be (weakly) better off in the new design (at 
least in expectation). 

2. In particular, students should have the option of matching together, 
just like they did up until then. A couple should get a guarantee that 
they will be matched together. 
The 2011 class was also generous enough to share their data with us. Our
initial idea was to use some form of Probabilistic Serial [13]. However, the
results were very similar to RSD, and we wanted to improve efficiency. Hence,
we decided to use a rank efficient mechanism, which essentially amounts to
maximizing some linear function of the number of students who get their
i’th rank, for every i (more on that in Section 3).

To cope with the Do No Harm (DNH) principle, we ran surveys, in which we gave
students two possible vectors of probabilities (for being assigned to each
hospital), and asked them which is better. We took the data, and tried to find
a simple utility model for it. The model we ended up using is the following: if
there are m hospitals and you have probability p to get your rank-i choice, this
gives you $p(m - i + 1)^2$ points of happiness, and a profile with more happiness
points is better. While one could find a better fit to the data using more parameters,
we were happy that there are no constants that we need to compute. We
also defined a different happiness function for parents, but it was not used
(see below).
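The happiness score described above can be sketched in a few lines. This is only an illustration of the squared-weight model, $(m - i + 1)^2$ points for rank i; the function name `happiness` and the two example profiles are hypothetical and not taken from the survey data.

```python
from fractions import Fraction

def happiness(probs_by_rank):
    """Expected happiness of a probability vector indexed by rank.

    probs_by_rank[i - 1] is the probability of getting the i-th choice.
    Rank i is worth (m - i + 1) ** 2 points, where m is the number of hospitals.
    """
    m = len(probs_by_rank)
    return sum(p * (m - i + 1) ** 2 for i, p in enumerate(probs_by_rank, start=1))

# Two hypothetical profiles over m = 4 hospitals (illustrative numbers only).
profile_a = [Fraction(1, 4), Fraction(1, 4), Fraction(5, 12), Fraction(1, 12)]
profile_b = [Fraction(1, 4), Fraction(1, 4), Fraction(1, 2), Fraction(0)]

score_a = happiness(profile_a)
score_b = happiness(profile_b)
print(score_a, score_b)  # profile_b scores higher, so it is the preferred profile
```

Under this model a student is asked to compare only the two totals, which is what makes the "no student is worse off" requirement easy to state as a single inequality per student.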

Given that we have a definition of happiness, we first compute for every
intern what her expected happiness would be under RSD. Then we choose
allocation probabilities to maximize total happiness, subject to capacities
and to the DNH principle: every agent must be at least as happy under the new
algorithm as under RSD. Given the probability matrix, we decompose it into a
convex sum of assignment matrices (more on that in Section 5), and choose a
matrix at random (according to the weights).

Switching to the new mechanism. We proposed the new mechanism to
the class of 2012, but were rejected. They did not want to have anything to
do with computers, and were afraid of bugs and backdoors.

For the 2013 class we approached the MoH. We reasoned that the MoH
would be less conservative than the student body, as it has more to gain from new
ideas: if the student body is given a new idea that would make it better off
with probability 0.4 and worse off with probability 0.6, then it should
reject it. However, under these terms it is in the MoH’s best interest to try
the new idea. If the new idea is a failure then one class pays the price, but
if it is a success then future generations will use it as well. In the MoH we
found moral support and more data, but no willingness to try the new system.
Their main argument was that they did not want to interfere in something
which is traditionally run by the students, and were not willing to use the
foreign-trained students as guinea pigs.

For the 2014 class we went back to the student internship committee.
At this stage we had a large quantity of historical data to present about
both algorithms, and also had a new idea: we would not run the lottery with
a computer, but rather hand them the decomposition of the matrix into a
weighted list of assignments, and let them choose the assignment. This way
they can compute their own marginal distributions, and verify that there is
no backdoor.⁵ This persuaded the committee to put this to a vote (along
with the parental benefits for that year), with the condition that the new
algorithm would be used only if it won an absolute majority. 80% of the medical
graduates who participated in the poll (55% of the students) voted in favor
of the new approach. Following the successful vote, we implemented
the algorithm and it was used to assign interns in 2014. In addition, it was
used for the foreign cohort that year. A subsequent vote of the following
cohort of interns, which took place in January 2015, approved the continued
use of our algorithm for 2015, and the foreign-trained interns will keep using
it until further notice. In order to increase transparency we explained
the rules of the new match to the student body.

We ran a survey in the class of 2014, presenting students with their
marginal distribution under RSD and under the new algorithm, and received 
supportive feedback. However, we were asked by the student committee to 
distribute the survey only to committee members, as we ran it after the 
lottery was over and they wanted to let the dead rest. This turned out to be 
very unfortunate, as the class of 2015 was mad at us for not handing each 
student in 2014 his or her marginal distribution. We have agreement from 
this year’s committee to perform the survey with the entire body of students. 

In the years 2014 and 2015, the students voted against giving any parental 
benefits. We think that the main source for this vote is not our algorithm, 
but a new school of medicine which accepts students who already earned an 
undergraduate degree (usually in Biology) and want to undergo retraining 
and perform a career change. The graduates of this school constitute the 
majority of parents in the market, and therefore it became an issue of “helping
students in the other school” which attracts less sympathy than “helping
the parents in my class”. 

Truthfulness and efficiency. Our main concern with the new mechanism
is that we sacrifice truthfulness (at least to some degree) to achieve efficiency.
This is probably a necessity: [32] showed that in large random markets all
asymptotically efficient, symmetric, and asymptotically strategy-proof
mechanisms are equivalent to Random Serial Dictatorship. While in theory it
could be that our market is not large enough, or that there is something
special in the valuation profiles we face, it is unlikely that this is the case. Hence,
any improvement in welfare will lead to a non-truthful mechanism, such
that some agents will be able to deviate and make a non-negligible gain.

5 To be more exact, they can compute their marginal distributions to see that they are
better off, and the distributions of the members of the committee to see that no one is
cheating.

When choosing between truthfulness and efficiency, we need to remember 
that truthfulness is a means, while efficiency is an end. We see two main 
arguments for truthful mechanisms: 

• Truthful mechanisms reduce cognitive burden for the participants. 

• When bidding is truthful, the designer can evaluate the welfare. When 
bidding is not truthful, such evaluation is not possible. 

We think that for this market, one can mitigate the disadvantages of a
non-truthful algorithm, and the gain in welfare is big enough to warrant some
degree of untruthfulness. Specifically, we believe that bidding today is truthful
(as much as anything can be truthful in ranking 25 priorities), and that
if in the future a large body of students were to bid untruthfully, then
discussions in forums and on Facebook revolving around strategic bidding would
emerge. We know that in the past, it was a common (and illegal) practice to
sell and buy internship positions after the lottery.⁶ Negotiating these deals
was done through social media, and it was not too difficult for us to find it.
Discussing bidding strategies should happen through the same channels, and
be less hidden (since there is nothing wrong with it).


3 The new mechanism 

We know that after the assignment is done, there is no room for trade. Still,
we want to let the students trade to achieve a better allocation. Therefore,
instead of trading seats in hospitals, we trade probability shares. Indeed,
a student i attending the lottery should not care too much about the
internal mechanics: all he should care about is $(p_{i,1}, \ldots, p_{i,23})$,
where $p_{i,k}$ is the probability that student i gets his k’th choice.

6 In theory, there should be no trade following a lottery if monetary transfers are not 
allowed. In practice, internship begins at least ten months (and sometimes eighteen months)
after the lottery is conducted, so it makes sense that someone who would want a desired 
hospital in city A would want to move to city B for some personal reason. Moreover, as 
mentioned, illegal monetary transfers were conducted. The students would approach the 
MoH, asking to trade places, hiding any financial agreement, and would be granted the 
permission to trade. The market price of a good internship used to be 2500$. 
Looking at the probability vector each student gets, RSD turns out to be
far from optimal. Consider for example the following fictitious lottery, which
involves four students and four hospitals, each with capacity 1 (Table 1).


Table 1: The four students’ prioritized preferences over hospitals. All students
ranked hospitals A and B as their first and second choice, respectively.
Alice and Diane each ranked hospital C as their third choice and
then hospital D, while Bob and Charlie each ranked hospital D as their third
choice and C as their fourth.

Choice   Alice   Diane   Bob   Charlie
1st      A       A       A     A
2nd      B       B       B     B
3rd      C       C       D     D
4th      D       D       C     C

We analyze the probability that Alice gets each hospital in this example:

• With probability 1/4 she is the first to choose. In this case she chooses
A, and therefore Pr(A) = 1/4.

• With probability 1/4 she is the second to choose. In this case A is already
taken, and therefore she chooses B. Therefore Pr(B) = 1/4.

• With probability 1/4 she is the third. In this case, A and B are already
taken, and Alice chooses C. But this is not the only way Alice can get
C. If Alice is the last to choose, then it is possible that C is still open,
and she can take it. Indeed, if Diane goes first and takes A, Bob goes
second and takes B, Charlie is third and takes D, then Alice can get
C although she is the last to choose. Looking at this more closely, one
can see that whenever Alice is fourth and Diane is not third, Alice
gets C; this happens with probability 1/4 · 2/3 = 1/6. The total
probability that Alice gets C is therefore Pr(C) = 1/4 + 1/6 = 5/12.

• With probability 1/4 · 1/3 = 1/12, Alice is fourth and Diane is third. This is the
only case in which Alice gets D, and therefore Pr(D) = 1/12.

One can verify that the sum of probabilities is 1, and Alice is always assigned
to a hospital.

A similar argument shows that Bob has probability 1/4 to get A,
probability 1/4 to get B, probability 5/12 to get D and probability 1/12 to get
C.
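The RSD probabilities in this four-student example can be checked by brute force, since there are only 24 possible orderings. The sketch below enumerates them exactly; the names `PREFS` and `rsd_probabilities` are illustrative, and couples and capacities above 1 are deliberately left out.

```python
from fractions import Fraction
from itertools import permutations

# Preferences from Table 1 (best first); every hospital has capacity 1.
PREFS = {
    "Alice":   ["A", "B", "C", "D"],
    "Diane":   ["A", "B", "C", "D"],
    "Bob":     ["A", "B", "D", "C"],
    "Charlie": ["A", "B", "D", "C"],
}

def rsd_probabilities(prefs):
    """Exact RSD probabilities, obtained by enumerating every picking order."""
    students = list(prefs)
    orders = list(permutations(students))
    share = Fraction(1, len(orders))  # each ordering is equally likely
    probs = {s: {} for s in students}
    for order in orders:
        taken = set()
        for s in order:
            # Each student takes the best hospital that is still free.
            choice = next(h for h in prefs[s] if h not in taken)
            taken.add(choice)
            probs[s][choice] = probs[s].get(choice, 0) + share
    return probs

probs = rsd_probabilities(PREFS)
for student, dist in probs.items():
    print(student, {h: str(p) for h, p in sorted(dist.items())})
```

Exact enumeration is only feasible for toy examples; for a real cohort the number of orderings is far too large, which is why a sampling approach is needed.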
Figure 1: Alice gives Bob her probability of being assigned to H, and in
return she gets half of this probability to her first choice and half of it to her
last choice from Bob.
Note that in this simple example, Alice and Bob could trade probabilities,
and this would benefit both of them. Imagine that Bob could somehow give
Alice his probability of being assigned to C, and in return she would give
him her probability of being assigned to D. This would result in a state in
which Alice has probability 1/4 for A, 1/4 for B and 1/2 for C. Bob would have
probability 1/4 for A, 1/4 for B and 1/2 for D, which is an improvement for
both of them, compared to the RSD probability shares. Charlie and Diane
could trade probabilities among themselves in a similar manner.

While this already improves the current state of affairs, we can do even
better. Suppose that Alice and Bob agree on their first and tenth choices,
but Bob’s second choice is some hospital H, which is also Alice’s ninth choice.
Also, suppose that Alice has some positive probability p of being assigned to
H. In this case both students would possibly be happier if Alice “gives” Bob
her probability p of getting to H, and Bob would “give” her p/2 probability of
getting to his first place, and p/2 probability of getting to his tenth place (see
Figure 1). In this case:

1. Assuming there is no huge gap between the ninth place and the tenth
place, Alice should be happy. She “lost” probability p of getting to
her ninth place, and received probability p/2 to get to her tenth place
(similar), and probability p/2 to get to her first place.

2. Assuming there is no huge gap between the first and second place,
Bob should be happy. He lost p/2 probability from his first place and
p/2 probability from his tenth place, to get p probability for his second
place.

While clearly this trade is beneficial, it raises a couple of subtle points, 
which are related: 
















1. Why should Bob give p/2 of his tenth place and p/2 of his first place? Why
not p/3 of his tenth place and 2p/3 of his first place, or vice versa, or some
different numbers?

2. The difference between first and second place is usually larger than that 
between ninth and tenth place. 

As explained below, the students were asked to fill in surveys, to assess the
difference between the first and the second place, the second and the third
place and so on. Based on the survey results, more weight was given to the
difference between first and second place than to the difference between the
ninth and the tenth.
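To make the two subtle points concrete, the trade between Alice and Bob can be scored with the squared weights of the happiness model, $(m - i + 1)^2$ points for rank i. The sketch below is a worked check under two assumptions that are not stated in the example itself: there are exactly m = 10 hospitals (so the tenth choice is the last one), and p = 1/10. The names `weight` and `trade_gains` are illustrative.

```python
from fractions import Fraction

M = 10  # assumption: ten hospitals, so the "tenth choice" is the last one

def weight(rank, m=M):
    """Happiness points for one unit of probability at the given rank."""
    return (m - rank + 1) ** 2

def trade_gains(p, bob_first_share=Fraction(1, 2)):
    """Change in happiness for Alice and Bob when Alice hands over p at hospital H.

    Bob pays back bob_first_share * p from his first choice and the rest of p
    from his tenth choice. H is Alice's 9th choice and Bob's 2nd choice.
    """
    from_first = bob_first_share * p
    from_tenth = p - from_first
    alice = from_first * weight(1) + from_tenth * weight(10) - p * weight(9)
    bob = p * weight(2) - from_first * weight(1) - from_tenth * weight(10)
    return alice, bob

p = Fraction(1, 10)
alice_gain, bob_gain = trade_gains(p)                    # the half/half split
alice_all, bob_all = trade_gains(p, Fraction(1))         # Bob pays only from his first choice
print(alice_gain, bob_gain)
print(alice_all, bob_all)
```

With the half/half split both gains are positive, while a split in which Bob pays entirely from his first choice leaves him worse off. The weights therefore determine which exchange rates are acceptable to both sides.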

One question which was not addressed in these simple examples is how to 
decide which students should trade with whom, and what trades to perform. 
Of course, the students do not actually trade with each other, but rather 
a computer program “virtually” trades on their behalf. We also use more 
complex trades, which may involve three or more students at once, if they 
benefit all the participants. 

Another question is how we perform the lottery at the end of the
process. With RSD, we could just choose an ordering over the students.
But in the first example, what lottery gives Alice probability 1/2 to get to
C and Bob probability 1/2 to get to D? Note that if each student would just
select a random hospital according to the probabilities, it is possible that
two students would be assigned to the same hospital, so this is not a valid
solution.

In the rest of the section, we formally explain our method, and solve the 
two questions presented above. 

3.1 Description of the new lottery 

Our method works as follows. First, approximate the probability of each
student to be assigned to each hospital using RSD. We do this by running
a large number of trials, N, and by the law of large numbers the average
of all those RSD lotteries will be sufficiently close to the true value. The
probability is calculated as

$$p_{i,k} = \frac{N_{i,k}}{N} \qquad (1)$$

where $N_{i,k}$ is the number of RSD lotteries in which Student i was assigned to
the hospital he ranked as his k-th choice.
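The estimation step of Eq. (1) can be sketched as a small Monte Carlo simulation. This is a minimal illustration, not the production code: the function name `estimate_rsd`, the fixed seed, and the toy preferences are assumptions, and the house rules (couples, students who pick first) are ignored.

```python
import random
from collections import defaultdict

def estimate_rsd(prefs, capacity, n_trials, seed=0):
    """Estimate, for each student, the share of RSD runs ending at each rank (1-based)."""
    rng = random.Random(seed)
    students = list(prefs)
    counts = {s: defaultdict(int) for s in students}
    for _ in range(n_trials):
        order = students[:]
        rng.shuffle(order)            # one uniformly random picking order
        free = dict(capacity)         # remaining seats per hospital
        for s in order:
            for rank, hospital in enumerate(prefs[s], start=1):
                if free[hospital] > 0:
                    free[hospital] -= 1
                    counts[s][rank] += 1
                    break
    # Divide each count by the number of trials, as in Eq. (1).
    return {s: {k: c / n_trials for k, c in counts[s].items()} for s in students}

# Toy instance: four students, four hospitals with one seat each.
prefs = {
    "Alice":   ["A", "B", "C", "D"],
    "Diane":   ["A", "B", "C", "D"],
    "Bob":     ["A", "B", "D", "C"],
    "Charlie": ["A", "B", "D", "C"],
}
capacity = {"A": 1, "B": 1, "C": 1, "D": 1}
estimate = estimate_rsd(prefs, capacity, n_trials=20000)
print(estimate["Alice"])
```

The sampling error shrinks like one over the square root of the number of trials, so a few tens of thousands of runs already give estimates accurate to roughly two decimal places.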

Once we have the approximated probabilities we continue with the second 
stage of the algorithm which is trading the probabilities among the students. 
We do this using Linear Programming, which is a mathematical optimization
method for maximizing a target function subject to several constraints [30].⁷
In our intern assignment problem there are two constraints:


1. Each hospital has an upper limit for the number of interns that can be 
assigned to it. This capacity constraint is determined by the Ministry 
of Health. 

2. No student is worse off compared to what he would have got under
RSD. This individual rationality constraint is enforced by defining the
happiness of each student from his vector of probabilities, and then
requiring that for every student individually the happiness cannot
decrease in the trading stage (intuitively, if a trade decreased his
happiness, the student would not trade).


As for the target function, we want to maximize the total satisfaction of
the students after trading. The full description of the constraints and the
target function appears in Appendix A.
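One possible way to write the program described above is sketched here. It is a reading of the two constraints and the target function, not a copy of the formulation in the appendix, and all symbols are notation assumed for this sketch: $x_{ij}$ is the probability that student i is assigned to hospital j, $c_j$ is the capacity of hospital j, $r_i(j)$ is the rank student i gave hospital j, $w_k = (m - k + 1)^2$ is the happiness weight of rank k, and $p^{\mathrm{RSD}}_{ij}$ is the estimated RSD probability.

```latex
\begin{align*}
\max_{x}\quad & \sum_{i}\sum_{j} w_{r_i(j)}\, x_{ij} \\
\text{s.t.}\quad
& \sum_{j} x_{ij} = 1 && \text{for every student } i, \\
& \sum_{i} x_{ij} \le c_j && \text{for every hospital } j, \\
& \sum_{j} w_{r_i(j)}\, x_{ij} \;\ge\; \sum_{j} w_{r_i(j)}\, p^{\mathrm{RSD}}_{ij} && \text{for every student } i, \\
& x_{ij} \ge 0 && \text{for all } i, j.
\end{align*}
```

The RSD probabilities themselves satisfy every constraint, so the program is always feasible, and its optimum can only raise total happiness relative to RSD.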

After the optimal probabilities have been acquired, we only need to
randomize an allocation according to these probabilities. This, however, is not
an easy process, as we want the lottery over valid allocations to respect all
the interns’ probability allocations simultaneously. Fortunately, the
Birkhoff-von Neumann theorem provides a solution to this problem [12, 44]. In order
to apply the theorem to the allocation problem, we represent the probabilities
we have at this stage as a matrix of size n × m, where the
rows are the interns, the columns are the hospitals, and cell (i, j) represents
Student i’s probability to be assigned to Hospital j. The theorem ensures
that any random assignment of the objects in the rows of the matrix to the
objects in the columns can be implemented. Furthermore, Birkhoff’s proof
provides a constructive algorithm for the implementation [36]. Using an
extension of this theorem we can create a lottery which respects the improved
probabilities gained by trading.
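The constructive step can be illustrated for the simplest case: a square matrix with one seat per hospital, whose rows and columns each sum to 1. The sketch below repeatedly peels off a perfect matching on the positive entries. It does not cover the n × m extension with capacities or couples that the thesis relies on, and the names `perfect_matching` and `birkhoff_decomposition` are illustrative.

```python
import random
from fractions import Fraction

def perfect_matching(matrix):
    """Perfect matching on the positive entries, found with augmenting paths."""
    n = len(matrix)
    match_col = [None] * n  # match_col[j] = row currently matched to column j

    def try_row(i, seen):
        for j in range(n):
            if matrix[i][j] > 0 and j not in seen:
                seen.add(j)
                if match_col[j] is None or try_row(match_col[j], seen):
                    match_col[j] = i
                    return True
        return False

    for i in range(n):
        if not try_row(i, set()):
            raise ValueError("no perfect matching: input is not doubly stochastic")
    return {match_col[j]: j for j in range(n)}  # row -> column

def birkhoff_decomposition(matrix):
    """Return [(weight, {row: col}), ...] whose weighted sum equals the matrix."""
    remaining = [row[:] for row in matrix]
    parts = []
    while any(x > 0 for row in remaining for x in row):
        matching = perfect_matching(remaining)
        theta = min(remaining[i][j] for i, j in matching.items())
        for i, j in matching.items():
            remaining[i][j] -= theta  # at least one entry drops to zero
        parts.append((theta, matching))
    return parts

q, h, z = Fraction(1, 4), Fraction(1, 2), Fraction(0)
# Rows: Alice, Diane, Bob, Charlie; columns: hospitals A, B, C, D (after trading).
P = [[q, q, h, z],
     [q, q, h, z],
     [q, q, z, h],
     [q, q, z, h]]
parts = birkhoff_decomposition(P)

# Drawing the lottery: pick one deterministic assignment according to the weights.
drawn = random.choices([m for _, m in parts], weights=[float(w) for w, _ in parts])[0]
print(len(parts), drawn)
```

Every pass zeroes at least one entry, so the loop ends after at most n² matchings, and the resulting weighted list is exactly the kind of object that can be handed to a committee for a manual draw.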


4 The new lottery results 


Figure 2 depicts the number of students who were assigned to one of their
top k choices as a function of k. The gap between the two curves shown
in the figure (the area under the dashed curve) represents the improvement
of the new method compared to RSD. While RSD would assign 203 interns
to their first choice hospital, 50 to the second and 59 to the third, the new
method assigns 216 interns to their first choice, 84 to their second choice and
70 to the third choice.

Figure 2: The number of interns as a function of being assigned one of their
top choices. The solid line shows the results of RSD, and the dashed line
shows the result of the new method.

7 Similar approaches to object assignment have been already suggested. See, for
example, [21].

Furthermore, using our data we would like to rate hospitals, according to 
how high the interns ranked them. Such a rating can be useful, since it gives 
the hospitals a better picture of their status, and raises a red flag if a specific 
hospital should improve the way it treats its interns or signals quality to the 
Ministry of Health. Before we aggregate the data to create such a rating, it 
is useful to look at the ranking distribution of a specific hospital, and see if 
it makes sense. 

For example, Hadassah Medical Center’s ranking distribution is depicted
in Figure 3. About 10% of the interns ranked Hadassah as one of their top
three choices, which possibly indicates that they are residents of Jerusalem 
and that location is important to them. The rest of the students ranked 
Hadassah around the middle of their rank order list, suggesting that interns 
consider Hadassah to be a good hospital, and that the demand for being an 
intern there is quite high. 

We now aggregate the rankings of all students, to create a rating across
hospitals. We compare two methods of rating the hospitals. The first method
is to use the same weights that we used when defining students’ happiness, as
described in Eq. (3).⁸ The second method is the traditional rating of hospitals
in this lottery, which is based on the number of interns who ranked a specific
hospital as their first choice. Figure 4 demonstrates that our new rating
approach provides very different results from the traditional rating approach,
and it is perhaps more advisable to use it as it takes into account the entire
rank order lists of all students.

Figure 3: The number of interns who ranked Hadassah hospital in each place.

8 The rating we get using this method is very similar to the rating one gets using the
more familiar Borda Count method.

Comparing the two ratings, one can see several differences: 

1. In the traditional approach, Rabin-Hasharon comes out last, although
Rabin-Beilinson came in second. The reason this happens is that
every student rates Rabin-Beilinson above Rabin-Hasharon, so Rabin-Hasharon
is never first. However, the difference is very small: they
are both campuses of the Rabin Hospital, and are 10 minutes apart.⁹
In the new rating, Beilinson comes in first (but almost at a tie with
Ichilov), and Hasharon comes sixth.

2. The last hospitals in the traditional rating have very low scores, and a 
single student who changes his vote can change the rating. The new 
rating is much more robust. 

3. The decay in the score makes more sense in the second rating. It is 
never too sharp, and there is no huge difference between the first two 
places. 

We take the new rating as further evidence that the values at which the 
algorithm trades probabilities make sense. 


5 Couples in the assignment problem 

This section deals with the student internship committee’s second condition,
and how we were able to comply with it. In the past, any two students
were allowed to declare that they are a “couple” and be matched together
(that is, to the same hospital) by receiving only one joint turn in the Random
Serial Dictatorship. The committee required us to give couples the
same option under the new algorithm. Similar to several other algorithms
for assignment of indivisible objects (e.g., [28]), our algorithm “trades”
probabilities among participants (based on their preferences), and reaches a
stochastic matrix that has to be decomposed into a convex combination of
valid assignments. The last step uses the famous Birkhoff-von Neumann
decomposition process [12, 44]. The fact that couples cannot be split imposes
(hard) complementarity constraints, which invalidate some of the assignments,
and consequently the decomposition process may become harder or
even impossible. On the bright side, only stochastic matrices in which both
members of the couple get exactly the same probability for each hospital are
candidates for decomposition. Is this little advantage enough to make the
problem solvable? And if not, what could be done?

Figure 4: A comparison between two rating approaches. [A] The number of
interns who ranked the given hospital as their first choice. [B] The calculated
value of the given hospital according to the survey-based weights.

9 Previously they appeared together in the internship forms under “Rabin” and a second
lottery was performed between the students to see which student goes to which campus.

In this work we first show that not only is the problem often not solvable 
when there are couples, but also that it is NP-hard to determine whether a 
given stochastic matrix with couples can be decomposed (Theorem 1). This
result extends trivially to environments with more general complementarities 
between groups of participants. 
Our second main contribution is in bypassing this impossibility result by
providing a polynomial time approximation algorithm that outputs a convex
combination of valid assignments that is similar to the target stochastic
matrix. We show that the approximation is tight and behaves like 2/q, where q
is the capacity of the smallest hospital participating in the match. In the last
section of the thesis we consider several extensions of the algorithm that can
be useful for other applications. We also use anonymized preference data
provided by the MoH to subsample and test our algorithm’s performance. We
find that our algorithm often performs significantly better on actual data
(compared to the theoretical bounds we establish).


5.1 Our Results 

1. It is NP-hard to determine whether a stochastic matrix with couples can 
be decomposed into a convex combination of deterministic assignments 
with couples. 

2. It is possible to approximate such a decomposition in polynomial time.
The approximation is tight and its quality is inversely proportional to the
smallest hospital's capacity.

3. Subsampling real preferences data from the Israeli Medical Internship
Match, for which the approximation was developed, shows excellent
performance of the approximation algorithm.


5.2 Related Work 


In the past two decades, mechanism design has been applied in ubiquitous
contexts and markets. The first and foremost is auctions, but other examples
include division of goods in a fair way (e.g., cake cutting), matching markets,
studying the effect and utilization of networks [10, 26, 41], [...]ets [22],
and other topics in social choice.

The assignment problem with couples is a specific case of multi-unit
allocation of indivisible objects without transfers (for example, course
allocation), and we indeed intend for our treatment to convey a message
regarding the more general case. [15] provides a deterministic algorithm to find
an ex-post efficient allocation that represents an approximately competitive
equilibrium from approximately equal incomes. [16] suggest a decomposition
in the spirit of Birkhoff and von Neumann, but complementarities are not
allowed. A contemporary working paper by Nguyen et al. is the closest to ours, and
it solves the problem of allocating bundles of indivisible objects by assuming
(like we do) that bundles are limited in size, and that the capacity constraints
are “soft”. While we insist on capacities being met, allowing deviations such as
those of Nguyen et al. and then correcting them in a smart way will give a
result similar to our approximation result (Theorem 3; see also Section 7). [1]
suggest a different kind of approximation, by ignoring the couples constraint
with a small probability.

The motivation for decomposing stochastic assignment matrices comes
from mechanisms that efficiently allocate probabilities in the interim stage.
Notable contributions for the case of single-item assignments are [23], who
proposes a competitive equilibrium from equal incomes approach, [13], who
suggest the Probabilistic Serial mechanism which is ordinally efficient (i.e.,
interim efficient given ordinal preferences and not cardinal preferences), and
[21], who studies rank-efficient mechanisms (similar in spirit to the one
implemented by us for the Israeli Medical Internship Match). We note that
several authors have dealt with the case of object assignment when objects
are the endowment of some agents, in which case the top-trading cycles (or
a variation thereof) is often the mechanism of choice (see, e.g., [40], [1]).

Finally the effect of couples in matching problems has also been studied 
in the context of two-sided matching. See, for example, the works by [fit ] 
and ji[. A treatment more related to our current approach is provided by 
5.3 Model and Notation

An assignment problem with couples is a tuple $(S, C, H, M)$. $S$ is a finite set
of single interns, and $C$ is a finite set of couples of interns, with each
$c \in C$ being a set of two interns $c = \{c_1, c_2\}$. We denote by
$I = S \cup (\bigcup_{c \in C} c)$ the set of all interns. $H$ is a finite set of
hospitals. $M \in [0,1]^{I \times H}$ is the target matrix and it satisfies

1. $\forall i \in I: \sum_{h \in H} M_{i,h} = 1$, and

2. $\forall \{c_1, c_2\} \in C, h \in H: M_{c_1,h} = M_{c_2,h}$.

We let $q_h = \sum_{i \in I} M_{i,h}$ be the capacity of hospital $h$, and
$q = \min_{h \in H} q_h$ be the capacity of the smallest hospital. Let
$\mathcal{P}$ be the domain of all assignment problems with couples. We specify
two special sub-domains of $\mathcal{P}$: $\mathcal{P}^{C=\emptyset}$ is
the set of problems without couples (i.e., $C = \emptyset$), and
$\mathcal{P}^{S>C}$ is the set of problems in which singles receive more weight
than couples in every hospital
(i.e., $\forall h \in H: \sum_{s \in S} M_{s,h} > 2 \sum_{c \in C} M_{c_1,h}$).
A matrix $M' \in [0,1]^{I \times H}$ that satisfies conditions (1) and (2), and
$\forall h \in H: \sum_{i \in I} M'_{i,h} = q_h$, is called a stochastic
assignment matrix (with respect to $P$). If all the elements of a stochastic
assignment matrix (with respect to $P$) are in $\{0,1\}$, then the matrix is
called a deterministic assignment matrix (with respect to $P$). A stochastic
assignment matrix $M'$ is decomposable if it can be represented as a convex
combination of deterministic assignment matrices, i.e.,
$\{(\lambda^k, M^k)\}_{k=1}^{K}$ such that for each $k$, $\lambda^k \in (0,1]$
and $M^k$ is a deterministic assignment matrix, $\sum_{k=1}^{K} \lambda^k = 1$,
and $M' = \sum_{k=1}^{K} \lambda^k M^k$. A straightforward extension of the
Birkhoff-von Neumann theorem shows that on the sub-domain
$\mathcal{P}^{C=\emptyset}$ all target matrices are decomposable. In Section 5.4
we show that on the general domain, $\mathcal{P}$, it is NP-hard to verify
whether a target matrix is decomposable.

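We say that a convex combination of deterministic assignment matrices
$\{(\lambda^k, M^k)\}_{k=1}^{K}$ $\varepsilon$-approximates a target matrix $M$ if

$$\left\| \left| M - \sum_{k=1}^{K} \lambda^k M^k \right| \cdot \mathbb{1}_{|H|} \right\|_{\infty} \leq \varepsilon,$$

where $\mathbb{1}_{|H|}$ is a vector of ones of size $|H|$ and the absolute value
is taken entry by entry, so we approximate the maximum value over the $L_1$ norm
of each row.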
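The metric above can be sketched in a few lines of Python. This is only an illustration: the function name `approximation_error` and the nested-list representation of matrices are assumptions made here and do not come from the thesis.

```python
from fractions import Fraction


def approximation_error(target, combination):
    """Max over interns (rows) of the L1 distance between the target row
    and the same row of the weighted mixture of deterministic matrices."""
    n_rows, n_cols = len(target), len(target[0])
    mixture = [[Fraction(0)] * n_cols for _ in range(n_rows)]
    for weight, matrix in combination:
        for i in range(n_rows):
            for h in range(n_cols):
                mixture[i][h] += weight * matrix[i][h]
    return max(
        sum(abs(target[i][h] - mixture[i][h]) for h in range(n_cols))
        for i in range(n_rows)
    )


half = Fraction(1, 2)
target = [[half, half], [half, half]]
identity = [[1, 0], [0, 1]]
swap = [[0, 1], [1, 0]]
exact = [(half, identity), (half, swap)]
skewed = [(Fraction(3, 4), identity), (Fraction(1, 4), swap)]

print(approximation_error(target, exact))   # 0
print(approximation_error(target, skewed))  # 1/2
```

In the skewed mixture each intern's row moves by 1/4 at each of the two hospitals, so the row-wise L1 distance is 1/2.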

A decomposition algorithm $A$ takes a problem $P \in \mathcal{P}$ and outputs a
convex combination of deterministic assignment matrices (with respect to
$P$). We say that $A$ provides an $f$-approximation on domain $\mathcal{P}$ if
for every $P \in \mathcal{P}$, $A(P)$ $f(P)$-approximates $M$. Our aim in
Section 5.5 is to provide a lower bound for the optimal approximation on
$\mathcal{P}$, and an upper bound for the optimal approximation on
$\mathcal{P}^{S>C}$.


5.4 NP-hardness result 

Theorem 1. On the domain $\mathcal{P}$, determining whether the target matrix is
decomposable is NP-complete.

Proof. We will reduce 3-edge coloring of cubic graphs to the interns-hospitals
assignment problem ([29] proved that 3-edge coloring of cubic graphs is
NP-complete).

Let $G = (V, E)$ be a cubic graph with $|V| = n$ vertices and $m = 3n/2$
edges. For each edge $e$ let us have 3 hospitals $A(e)$, $B(e)$, $C(e)$, each of
capacity 2. For each vertex $v$ and each color $i \in \{1, 2, 3\}$ let us have a
hospital $(v, i)$ with capacity 1. Thus altogether we have $3m$ hospitals of
capacity 2 and $3n$ hospitals of capacity 1, so the total hospital capacity is
$3n + 6m = 3n + 6 \cdot 3n/2 = 12n$.
Now for each edge $e = (u, v)$ we have one couple that wants to be either
in $A(e)$ or in $B(e)$ or in $C(e)$, each with probability 1/3, and we also have
6 single interns as follows:

• First single wants either $A(e)$ with probability 2/3 or $(u, 1)$ with
probability 1/3.

• Second wants either $A(e)$ with probability 2/3 or $(v, 1)$ with probability
1/3.

• Third wants either $B(e)$ with probability 2/3 or $(u, 2)$ with probability
1/3.

• Fourth wants either $B(e)$ with probability 2/3 or $(v, 2)$ with probability
1/3.

• Fifth wants either $C(e)$ with probability 2/3 or $(u, 3)$ with probability
1/3.

• Sixth wants either $C(e)$ with probability 2/3 or $(v, 3)$ with probability
1/3.

Note that altogether we have $(2 + 6)m = 8 \cdot 3n/2 = 12n$ interns. Hence,
any assignment of all of them uses all capacity of all hospitals.

Suppose first that $G$ is class 1 (that is, it is 3-edge colorable). Fix a
proper 3-edge coloring by colors 1, 2, 3. Each such coloring corresponds to a
full assignment of all interns as follows:

If edge $e = (u, v)$ is colored 1 then the couple is in $A(e)$, the first single
intern is in $(u, 1)$, the second in $(v, 1)$, the third and fourth in $B(e)$,
and the fifth and sixth in $C(e)$. If $e$ is colored 2 or 3 we proceed
symmetrically in the obvious way. It is clear that this is a full assignment.

Now give this assignment weight 1/3, and give weight 1/3 to each of the
two other assignments obtained from it by cyclically shifting the colors of all
edges. This gives a decomposition of our matrix of probabilities.

Conversely, if there is a decomposition, then any single assignment in its
support must be a full assignment (assigning all interns and saturating all
hospitals). But this means that for each edge $e$, among the 8 interns
corresponding to this edge, 6 including the couple are assigned to the hospitals
$A(e)$, $B(e)$, $C(e)$ and there is a unique $i \in \{1, 2, 3\}$ so that two of
the single interns are assigned to $(u, i)$ and $(v, i)$. Coloring each edge $e$
with this $i$ is proper, because every hospital $(v, i)$ has capacity 1 and is
saturated, and thus $G$ is 3-edge colorable, as needed. □
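The forward direction of the reduction can be checked mechanically on a small instance. The sketch below is illustrative only: it uses the complete graph $K_4$ (which is cubic and 3-edge colorable) as an assumed test instance, and all identifiers (`build_target`, `assignment_from_colouring`, the tuple labels for interns and hospitals) are hypothetical names introduced here.

```python
from collections import Counter
from fractions import Fraction

THIRD = Fraction(1, 3)
NAMES = {1: "A", 2: "B", 3: "C"}


def build_target(edges):
    """Target matrix of the reduction as {intern: {hospital: probability}}."""
    target = {}
    for e in edges:
        for member in (1, 2):  # both members of the couple share one row
            target[("couple", e, member)] = {(NAMES[i], e): THIRD for i in NAMES}
        for i in NAMES:        # two singles per colour, one per endpoint
            for endpoint in e:
                target[("single", e, i, endpoint)] = {
                    (NAMES[i], e): 2 * THIRD,
                    ("vertex", endpoint, i): THIRD,
                }
    return target


def assignment_from_colouring(edges, colouring):
    """Deterministic assignment induced by a proper 3-edge colouring."""
    assignment = {}
    for e in edges:
        for member in (1, 2):
            assignment[("couple", e, member)] = (NAMES[colouring[e]], e)
        for i in NAMES:
            for endpoint in e:
                if i == colouring[e]:   # the couple occupies this edge hospital
                    hospital = ("vertex", endpoint, i)
                else:
                    hospital = (NAMES[i], e)
                assignment[("single", e, i, endpoint)] = hospital
    return assignment


# K4 is cubic (n = 4, m = 6); the colouring below is proper.
edges = [(0, 1), (2, 3), (0, 2), (1, 3), (0, 3), (1, 2)]
colouring = dict(zip(edges, [1, 1, 2, 2, 3, 3]))

target = build_target(edges)
capacity = Counter()
for row in target.values():
    for hospital, p in row.items():
        capacity[hospital] += p

# Mix the colouring with its two cyclic colour shifts, weight 1/3 each.
mixture = {intern: Counter() for intern in target}
for shift in range(3):
    shifted = {e: (c - 1 + shift) % 3 + 1 for e, c in colouring.items()}
    assignment = assignment_from_colouring(edges, shifted)
    load = Counter(assignment.values())
    assert all(load[h] == capacity[h] for h in capacity)  # every hospital is full
    for intern, hospital in assignment.items():
        mixture[intern][hospital] += THIRD

print(len(target), sum(capacity.values()))                 # 48 48
print(all(dict(mixture[i]) == target[i] for i in target))  # True
```

With $n = 4$ the instance has $12n = 48$ interns and total capacity 48, and the three shifted assignments average back to the target matrix.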
5.5 Approximation

5.5.1 Lower bound

Theorem 2. If $A$ provides an $f$-approximation on $\mathcal{P}^{S>C}$, then for
any $n \in \mathbb{N}$ there exists $P \in \mathcal{P}^{S>C}$ such that
$|I| > n$ and $f(P) \geq \frac{2}{q+2}$.


Proof. Given $n \in \mathbb{N}$, let $q = 4\lceil n/4 \rceil$. Let
$S = \{s_1, \ldots, s_{\frac{1}{2}q+1}, s'_1, \ldots, s'_{\frac{1}{2}q+1}\}$,
$C = \{c^*, c_1, \ldots, c_{\frac{1}{4}q-1}, c'_1, \ldots, c'_{\frac{1}{4}q-1}\}$,
and $H = \{h, h'\}$. Consider the target matrix $M \in \mathcal{P}^{S>C}$
described in Table 2. Under this stochastic assignment matrix each single intern
in $\{s_1, \ldots, s_{\frac{1}{2}q+1}\}$ and each couple in
$\{c_1, \ldots, c_{\frac{1}{4}q-1}\}$ gets a probability of 1 to be assigned to
$h$, each single intern in $\{s'_1, \ldots, s'_{\frac{1}{2}q+1}\}$ and each
couple in $\{c'_1, \ldots, c'_{\frac{1}{4}q-1}\}$ gets a probability of 1 to
be assigned to $h'$, and the couple $c^*$ gets an equal probability to be
assigned to either $h$ or $h'$. Note that the capacity of both hospitals is $q$.

In any convex combination that approximates $M$, one of the hospitals
will be assigned at least $\frac{1}{4}q$ of the couples with a weight of at
least $\frac{1}{2}$. This means that with probability $\frac{1}{2}$ the
left-over capacity for single interns is at most $\frac{1}{2}q$, whereas the
single interns require $\frac{1}{2}q + 1$. The optimal way to
minimize the deviation from the singles' probability would be to divide the
deviation equally among all the relevant single interns. This means that
each individual intern in the relevant hospital will get at most
$\frac{1}{2} \cdot 1 + \frac{1}{2} \cdot \frac{\frac{1}{2}q}{\frac{1}{2}q+1}$
instead of getting 1, and to complete the individual's total probability to 1,
the single will get
$1 - \left(\frac{1}{2} \cdot 1 + \frac{1}{2} \cdot \frac{\frac{1}{2}q}{\frac{1}{2}q+1}\right)$
to the other hospital (instead of 0).
Therefore the approximation cannot be better than:

$$2\left(1 - \left(\frac{1}{2} \cdot 1 + \frac{1}{2} \cdot \frac{\frac{1}{2}q}{\frac{1}{2}q+1}\right)\right) = 2 \cdot \frac{1}{q+2} = \frac{2}{q+2}.$$

□ 
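The arithmetic behind the bound can be checked with exact fractions. This is a small sanity check only; the helper name `lower_bound` is an assumption introduced here.

```python
from fractions import Fraction


def lower_bound(q):
    """L1 deviation forced on a single intern in the two-hospital example."""
    half = Fraction(1, 2)
    singles = half * q + 1                       # singles who want the crowded hospital
    kept = half * 1 + half * (half * q) / singles
    return 2 * (1 - kept)                        # mass lost at one hospital reappears at the other


for q in (4, 8, 100):
    assert lower_bound(q) == Fraction(2, q + 2)

print(lower_bound(4))  # 1/3
```

For $q = 4$, for instance, each of the three singles keeps $\frac{1}{2} + \frac{1}{2} \cdot \frac{2}{3} = \frac{5}{6}$ at the crowded hospital, so the row-wise deviation is $2 \cdot \frac{1}{6} = \frac{1}{3}$.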

One may argue that the example provided in the proof of Theorem 2
seems quite pathological. Indeed, the target matrix allocates many interns
to either h or h' with probability 1. There could be scenarios in which it would
be very reasonable to consider only target matrices in which each intern's
probability of reaching any particular hospital is bounded by some expression
related to the number of hospitals (see also the concluding discussion).
Nevertheless, in our main application the chosen algorithm involves trading
probability between interns, and so it often outputs target matrices that are
Table 2: Target matrix for the proof of Theorem 2

Intern                          h      h'
$s_1$                           1      0
⋮                               1      0
$s_{\frac{1}{2}q+1}$            1      0
$s'_1$                          0      1
⋮                               0      1
$s'_{\frac{1}{2}q+1}$           0      1
$c^*_1$                         0.5    0.5
$c^*_2$                         0.5    0.5
$c_{1,1}$                       1      0
$c_{1,2}$                       1      0
⋮                               1      0
$c_{\frac{1}{4}q-1,1}$          1      0
$c_{\frac{1}{4}q-1,2}$          1      0
$c'_{1,1}$                      0      1
$c'_{1,2}$                      0      1
⋮                               0      1
$c'_{\frac{1}{4}q-1,1}$         0      1
$c'_{\frac{1}{4}q-1,2}$         0      1
close to being pathological in exactly the same sense. This motivates our
approximation metric and the consideration of extreme cases. More importantly,
a similar bound can be achieved even if probabilities are restricted
to being small, although the problems are not in the domain $\mathcal{P}^{S>C}$
(see Appendix B). Finding a lower bound in the domain $\mathcal{P}^{S>C}$ when
probabilities are constrained to be “small” remains an open problem.

5.5.2 Suggested algorithm for upper bound 

We present an approximation algorithm for decomposing any matrix
$M \in \mathcal{P}$. The algorithm can be roughly divided into two main stages:
the first stage deals only with couples and assigns them to hospitals according
to their probability in $M$, and the second stage “fixes” the probabilities of
the single interns in $M$ to match the assignment of the couples from the first
stage. In the second stage the over-demanded capacity taken by the couples in any
specific hospital is deducted from the singles' demand according to a division
which preserves the weight of each single intern in the singles' demand.

The algorithm starts by “enlarging” each hospital's capacity to the sum of
the couples' probabilities to be assigned to $h$, divided by two and rounded up
to the nearest integer. The couples now behave like singles, and for each hospital
we add a new single to take the “extra” capacity (the new single's demand
is completed by a dummy hospital, $h_\phi$). Doing so for all hospitals gives
us a new stochastic assignment matrix containing only “singles” (with each
single representing an original couple), and it can be decomposed using the
Birkhoff-von Neumann theorem, to get a convex combination of deterministic
assignment matrices that represent the allocation of the couples. In some of
these assignments the couples take slightly more capacity than their expected
share (but only up to the nearest multiple of two).
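The construction of the couples-only matrix described above can be sketched as follows. This is a minimal illustration under stated assumptions: the function name `first_stage_matrix`, the dictionary representation, the label `"h_phi"` for the dummy hospital, and the `("filler", h)` label for the added single are all hypothetical choices made here, with one row per couple.

```python
from fractions import Fraction
from math import ceil


def first_stage_matrix(couple_rows, hospitals, dummy="h_phi"):
    """Build the couples-only matrix with one filler single per hospital.

    couple_rows maps each couple to {hospital: probability}, one row per couple.
    """
    columns = list(hospitals) + [dummy]
    matrix = {}
    for couple, row in couple_rows.items():
        matrix[couple] = {col: Fraction(row.get(col, 0)) for col in columns}
    for h in hospitals:
        demand = sum(matrix[c][h] for c in couple_rows)
        extra = ceil(demand) - demand          # mass needed to reach an integer
        filler = {col: Fraction(0) for col in columns}
        filler[h] = extra
        filler[dummy] = 1 - extra              # rest of the filler's unit demand
        matrix[("filler", h)] = filler
    return matrix


quarter = Fraction(1, 4)
couples = {
    "c1": {"h1": 2 * quarter, "h2": 2 * quarter},
    "c2": {"h1": quarter, "h2": 3 * quarter},
}
M_prime = first_stage_matrix(couples, ["h1", "h2"])
column_sums = {
    col: sum(row[col] for row in M_prime.values()) for col in ["h1", "h2", "h_phi"]
}
print(column_sums)
```

In this toy instance the couples demand 3/4 and 5/4 of a couple-slot at the two hospitals, so the fillers add 1/4 and 3/4 and every column sum becomes an integer (1, 2, and 1 for the dummy hospital), which is what the Birkhoff-von Neumann step needs.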

In the second stage of the algorithm, we look at the residuals of the
first-stage assignment. For each couples' assignment and for each hospital we
check whether the couples exceeded their expected share in that hospital,
and if so, we let all the singles reduce their demand in proportion to their
respective weights. The missing demand is directed at a dummy hospital,
$h_\phi$. We again have a stochastic assignment matrix containing only singles
that can be decomposed using the Birkhoff-von Neumann theorem (see footnote 10).
The output of the decomposition assigns some of the singles to the dummy
hospital, so we move them to vacant positions arbitrarily. The algorithm then
combines back

10 Strictly speaking, the demand needs to be rounded up to the nearest integer, and this 
can be done again by adding dummy interns for each hospital, and putting all the leftover 
demand in the dummy hospital. 
the first-stage and the second-stage assignments to get a valid deterministic
assignment matrix with respect to the original problem.
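The second-stage correction of the singles' demand can be sketched as below. This is an illustrative reading of the description, not the thesis code: the function name `adjust_singles`, its arguments, and the `"h_phi"` label are assumptions, and every hospital that appears in a single's row is assumed to have an entry in `couple_demand`.

```python
from fractions import Fraction
from math import ceil


def adjust_singles(single_rows, couple_demand, couples_placed, dummy="h_phi"):
    """Shrink singles' demand where the couples took more than their share.

    single_rows:    {single: {hospital: probability}} taken from M
    couple_demand:  {hospital: sum over couples c of M[c_1][hospital]}
    couples_placed: {hospital: number of couples placed there in this
                     first-stage assignment}
    """
    singles_total = {
        h: sum(row.get(h, 0) for row in single_rows.values()) for h in couple_demand
    }
    adjusted = {}
    for single, row in single_rows.items():
        new_row = {}
        for h, p in row.items():
            if p > 0 and couples_placed.get(h, 0) > couple_demand[h]:
                # Seats taken by couples beyond their expected share.
                excess_seats = 2 * (ceil(couple_demand[h]) - couple_demand[h])
                p = p - excess_seats * p / singles_total[h]
            new_row[h] = p
        new_row[dummy] = 1 - sum(new_row.values())  # missing demand goes to the dummy
        adjusted[single] = new_row
    return adjusted


one, half = Fraction(1), Fraction(1, 2)
singles = {f"s{i}": {"h1": one} for i in (1, 2, 3)}
singles.update({f"s{i}": {"h2": one} for i in (4, 5, 6)})
demand = {"h1": half, "h2": half}  # one couple, split evenly between h1 and h2
adjusted = adjust_singles(singles, demand, couples_placed={"h1": 1, "h2": 0})

print(adjusted["s1"])  # h1 drops to 2/3, the remaining 1/3 goes to h_phi
print(adjusted["s4"])  # unchanged: h2 keeps probability 1
```

In this toy market both hospitals have capacity 4. When the couple lands in h1 it uses one seat more than its expected share, so the three singles there give up 1/3 each, and the total mass sent to the dummy hospital equals the one vacant seat left in h2.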

The full algorithm's pseudo-code can be found in Algorithm 1.

Theorem 3. Algorithm 1 provides an $f$-approximation on $\mathcal{P}^{S>C}$,
where $f(P) = 2/q$.

Lemma 4. Algorithm 1 runs in polynomial time.

Proof. The size of the convex combination $\{(\lambda^k, M^k)\}_{k=1}^{K}$
related to the couples is polynomial. In addition, we use Birkhoff's proof, which
provides a constructive algorithm for implementing the decomposition [36] in
polynomial time, so the whole algorithm is polynomial. □
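For intuition, a generic constructive Birkhoff-von Neumann decomposition for a square bistochastic matrix can be sketched as follows. This is a textbook-style sketch, not necessarily the implementation referenced above; the function names are assumptions. A hospital with integer capacity $q_h$ can be handled by splitting its column into $q_h$ unit-capacity columns before calling it.

```python
from fractions import Fraction


def perfect_matching(support):
    """Kuhn's augmenting-path algorithm; support[i] lists columns allowed for row i."""
    match_of_col = {}

    def try_row(i, seen):
        for j in support[i]:
            if j in seen:
                continue
            seen.add(j)
            if j not in match_of_col or try_row(match_of_col[j], seen):
                match_of_col[j] = i
                return True
        return False

    for i in range(len(support)):
        if not try_row(i, set()):
            raise ValueError("no perfect matching: matrix is not bistochastic")
    return {i: j for j, i in match_of_col.items()}


def birkhoff_von_neumann(matrix):
    """Decompose a square bistochastic matrix into (weight, permutation) pairs."""
    n = len(matrix)
    rest = [[Fraction(x) for x in row] for row in matrix]
    parts = []
    while any(x > 0 for row in rest for x in row):
        support = [[j for j in range(n) if rest[i][j] > 0] for i in range(n)]
        perm = perfect_matching(support)            # exists by Hall's theorem
        weight = min(rest[i][perm[i]] for i in range(n))
        for i in range(n):
            rest[i][perm[i]] -= weight              # zeroes at least one entry
        parts.append((weight, perm))
    return parts


half = Fraction(1, 2)
example = [[half, half, 0], [half, 0, half], [0, half, half]]
parts = birkhoff_von_neumann(example)
print(len(parts), sum(weight for weight, _ in parts))
```

Each iteration removes at least one positive entry, so there are at most $n^2$ iterations, each dominated by one bipartite matching computation.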

Proof of Theorem 3. In the first part of the algorithm (lines 1 through 10)
the couples are allocated exactly their demanded capacity. This is done
by taking only the couples, treating them as single agents, enlarging the
capacities related to the couples so that they become integers, and adding a
dummy hospital to take all the left-over probabilities of the dummy agents.
Then it is possible to use the Birkhoff-von Neumann decomposition.

We note that since each hospital has only one related dummy agent,
the capacity taken by the couples is always either the full (rounded up)
capacity of the hospital according to $M'$, or one less than that. In the latter
case, the singles can be allocated their entire demand for this hospital in the
next step. However, in the former case the couples exceeded their fractional
capacity, so we “fix” the demand of the single interns, and each intern $i$ loses
$2(\lceil q'_h \rceil - q'_h)$ times his weight in the sum of probabilities of
singles, where $q'_h = \sum_{c \in C} M_{c_1,h}$.

Let $y_h$ be the total weight given to the matrices of the second kind (for a
certain hospital $h$), in which the couples take $\lceil q'_h \rceil$. Then the
following holds:

$$(1 - y_h) \cdot \lfloor q'_h \rfloor + y_h \cdot \lceil q'_h \rceil = q'_h \qquad (2)$$

If we denote by $x_h = 2(q'_h - \lfloor q'_h \rfloor)$ the fractional part of the
couples' total probability, measured in seats, we get $y_h = x_h/2$. Now each
time the second case materializes, the singles lose $2 - x_h$. The total loss is
given by $(2 - x_h) \cdot y_h = (2 - x_h) x_h / 2$, which is maximized at
$x_h = 1$, where it equals $1/2$. Since we assumed that each hospital contains
more single interns than interns who are part of a couple, the singles' total
weight in $h$ is at least $q_h/2$, so each single $s$ loses at most
$M_{s,h} \cdot \frac{1/2}{q_h/2} = M_{s,h}/q_h$.

Across all hospitals, each single intern $s$ loses at this point at most
$\sum_{h \in H} M_{s,h}/q_h \leq (\sum_{h \in H} M_{s,h})/q = 1/q$. However, in
the last phase of the algorithm (line 27), singles receive back their “lost
probability” in arbitrary hospitals, and so each single can deviate up to $1/q$
further from her endowment, resulting in a total of at most $2/q$. □
Algorithm 1 Approximation

Input: A stochastic assignment matrix $M^{I \times H} \in \mathcal{P}$

 1: $S' = C \cup H$, $H' = H \cup \{h_\phi\}$
 2: Create a new matrix $M' \in [0,1]^{|S'| \times |H'|}$ (initialize with zeros)
 3: for all $h \in H$ do
 4:     for all $c \in C$ do
 5:         $M'_{c,h} = M_{c_1,h}$
 6:     end for
 7:     $M'_{h,h} = \lceil \sum_{c \in C} M'_{c,h} \rceil - \sum_{c \in C} M'_{c,h}$
 8:     $M'_{h,h_\phi} = 1 - M'_{h,h}$
 9: end for    ▷ We get that $(S', \emptyset, H', M') = P' \in \mathcal{P}^{C=\emptyset}$
10: Decompose $M'$ into a convex combination of deterministic assignment
    matrices $\{(\lambda^k, M^k)\}_{k=1}^{K}$ (with respect to $P'$)
11: Create an empty set of allocations $\psi = \{\}$
12: for $k = 1$ to $K$ do
13:     $\tilde{H} = H \cup \{h_\phi\}$
14:     Create a new matrix $\tilde{M} \in [0,1]^{|S| \times |\tilde{H}|}$ (initialize with zeros)
15:     for all $s \in S$ do
16:         for all $h \in H$ do
17:             if $\sum_{c \in C} M^k_{c,h} > \sum_{c \in C} M'_{c,h}$ then    ▷ If couples exceeded their quota
18:                 $\tilde{M}_{s,h} = M_{s,h} - 2\left(\lceil \sum_{c \in C} M'_{c,h} \rceil - \sum_{c \in C} M'_{c,h}\right) \cdot \frac{M_{s,h}}{\sum_{s' \in S} M_{s',h}}$
19:             else
20:                 $\tilde{M}_{s,h} = M_{s,h}$
21:             end if
22:         end for
23:         $\tilde{M}_{s,h_\phi} = 1 - \sum_{h \in H} \tilde{M}_{s,h}$
24:     end for    ▷ We get that $(S, \emptyset, \tilde{H}, \tilde{M}) = \tilde{P} \in \mathcal{P}^{C=\emptyset}$
25:     Decompose $\tilde{M}$ into a convex combination of deterministic assignment
        matrices $\{(\tilde{\lambda}^l, \tilde{M}^l)\}_{l=1}^{L}$ (with respect to $\tilde{P}$)
26:     for $l = 1$ to $L$ do
27:         Stitch $M^k$ and $\tilde{M}^l$ into a valid deterministic assignment matrix
            $M^{k,l}$ for $P$. Couples get the hospitals they are assigned under $M^k$,
            singles not assigned to $h_\phi$ get the hospitals they are assigned under
            $\tilde{M}^l$, and singles assigned to $h_\phi$ get the rest of the vacant
            positions (arbitrarily)
28:         Add $(\lambda^k \cdot \tilde{\lambda}^l, M^{k,l})$ to $\psi$
29:     end for
30: end for
31: Output $\psi$
6 Experimental results

In this section we characterize the interns’ preferences and provide simulation 
results using the Israeli Medical Internship Match data provided to us by the 
Israeli MoH. The data contained the preferences of interns starting from 
1995. However, due to significant changes in the hospitals and the internship 
conditions in them, some of the earlier data cannot be treated as coming 
from a similar distribution to the more recent data. We therefore focused 
only on preferences from the years 2010-2014. During those years there 
were 23 possible hospitals for each intern to rate, and interns must rank all 
hospitals. The number of interns varied from one year to another, but it was
always around 500, and the number of internship positions was always equal
to the number of interns. Preferences of foreign trained doctors are also
available to us, but only until the year 2012.

6.1 Description of interns’ preferences 

Looking at the interns' preferences, there are several interesting points to be
made. The first is that preferences are geographically driven. For example,
ranking the hospitals according to the number of students who listed this
hospital as their top priority (this is the ranking published by the MoH),
we see that in the top five hospitals there are two hospitals from the center,
one from the north, one from the south, and one from the Jerusalem area
(these are four of the districts in Israel). However, for any given student,
such a geographically dispersed ranking is very unlikely. Indeed, except for
Ichilov hospital (a hospital in the center of Israel which represents the top
option for most of the population), it is common that all the top choices of
a student come from the same geographic area (see footnote 11). To measure this
effect, Figure 5 shows the percentage of interns whose top k choices all came
from the same area, for k = 2 to 10 (this graph ignores Ichilov hospital). Note
that if choices were random, one would expect an exponential decay, whereas
here we see a linear decay.
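The statistic plotted in Figure 5 can be computed along the following lines. This is a sketch on made-up data: the function name `same_area_share`, the hospital labels other than Ichilov, and the area mapping are hypothetical and do not come from the MoH data.

```python
def same_area_share(rankings, area_of, k, ignore=()):
    """Fraction of interns whose top-k hospitals (after dropping `ignore`)
    all belong to a single geographic area."""
    hits = 0
    for ranking in rankings:
        top = [h for h in ranking if h not in ignore][:k]
        hits += len({area_of[h] for h in top}) == 1
    return hits / len(rankings)


# Hypothetical toy data: two center hospitals, two northern ones, plus Ichilov.
area_of = {"Ichilov": "center", "C1": "center", "C2": "center",
           "N1": "north", "N2": "north"}
rankings = [
    ["Ichilov", "N1", "N2", "C1", "C2"],
    ["C1", "C2", "N1", "Ichilov", "N2"],
    ["N1", "C1", "N2", "C2", "Ichilov"],
]

for k in (2, 3):
    print(k, same_area_share(rankings, area_of, k, ignore={"Ichilov"}))
```

Dropping Ichilov before taking the top k mirrors the remark that the graph ignores that hospital.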

Another interesting point is the differences between the Israeli interns
(interns who studied medicine in Israel) and the foreign trained doctors.
Foreign trained graduates have even stronger geographic preferences than Israeli
ones. This helps explain another difference between these two populations:
their performance in the old RSD mechanism. Under that mechanism
the average rank of the hospital that the Israeli interns received was 4.594,

11 It is known that one can always swap Ichilov hospital later for something else. Hence 
some students pick it first, because they want the flexibility - the lottery is conducted 
between ten and eighteen months before the internship starts. 
Figure 5: Geographical orientation at the top of the list



whereas the foreign trained doctors (who are matched separately) received
an average rank of 2.549. Similarly, under the new mechanism the average
rank of Israeli interns improved to 3.686, and the foreign trained doctors
got 2.042. Our hypothesis was that the foreign trained doctors' preferences
therefore exhibit more heterogeneity in some sense (see footnote 12).

To verify that foreign trained doctors' preferences are more heterogeneous,
we grouped preferences according to the top three hospitals, sorted by
frequency, and plotted the percentage of students who chose the most common
triplet, one of the two most common triplets, and so on. For example, if
the most common triplet for the foreign trained doctors is A, B, C, and the
second most common is B, C, A, we are interested in the fraction of students
whose top three choices are A, B, C, and in the fraction of students whose
top three choices are either A, B, C or B, C, A. The results are presented in
Figure 6. The right panel shows that the cumulative distribution of local
interns is much higher than the cumulative distribution of the foreign trained
doctors. The top ten triplets cover almost 50% of the Israeli interns, but
only about 20% of the foreign trained doctors. Amazingly, even the density
of each of the top 10 triplets (which are not necessarily the same among the
two populations) is higher for the local interns (as depicted by the left panel).
This indicates that the local interns' preferences are indeed much more
homogeneous, at least at the top of the distribution. We note that Figure 5
suggests that foreign trained doctors do exhibit a higher tendency to group
hospitals by area. The only reasonable conclusion is that while foreign trained
doctors care a lot about geography, they are less prone to necessarily select
the center of Israel as their preferred location, or that they tend to have more
heterogeneity in ranking hospitals within each area.
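The triplet statistics behind Figure 6 can be sketched as follows. The function name `triplet_coverage` and the toy rankings are assumptions for illustration; triplets are treated as ordered, in line with the example that distinguishes A, B, C from B, C, A.

```python
from collections import Counter


def triplet_coverage(rankings, top_n):
    """Density and cumulative share of the top_n most common ordered top-3 triplets."""
    triplets = Counter(tuple(r[:3]) for r in rankings)
    total = len(rankings)
    density = [count / total for _, count in triplets.most_common(top_n)]
    cumulative, running = [], 0.0
    for d in density:
        running += d
        cumulative.append(running)
    return density, cumulative


# Hypothetical toy rankings over hospitals A-D.
rankings = [
    ["A", "B", "C", "D"],
    ["A", "B", "C", "D"],
    ["B", "C", "A", "D"],
    ["D", "A", "B", "C"],
]
density, cumulative = triplet_coverage(rankings, top_n=2)
print(density, cumulative)  # [0.5, 0.25] [0.5, 0.75]
```

A more homogeneous population shows up as a cumulative curve that rises faster, which is the comparison made between the two panels.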

We also tried to study the differences between the preferences of singles 
and couples, but did not reach any significant conclusions. Furthermore, 

12 The common conspiracy theory is that the MoH gives them more seats at the preferable 
hospitals. We have verified the numbers and the conspiracy theory is incorrect. 
Figure 6: Distribution of top triplets (left panel: density, right panel:
cumulative)



while we did not perform a thorough analysis, it is reasonable to assume that
preferences depend also on the medical school in which an intern completed
her medical studies, for both personal reasons (already lives in the same city)
and professional reasons (knows the hospital better from her time at medical
school). This by itself might not have an impact on designing a mechanism,
but it may be of interest for better understanding the distribution of
preferences and what creates heterogeneity.

6.2 Simulation results for the algorithm 

In the simulations we used a subsampling method, i.e., we drew preference 
profiles from the union of all data points, where we distinguish between 
preferences that were submitted by single interns and preferences that were 
submitted by couples. This allowed us to create multiple “possible markets”, 
each with |I| = 496 (which was the actual number of interns in 2014) and
with 24 couples. For each point in the figure we used 5,000 different market 
draws. 
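A subsampling step of this kind might look as follows. The sketch makes two assumptions that the text does not settle: profiles are drawn with replacement, and the 496 interns include both members of each of the 24 couples (so 448 singles are drawn). The function name `draw_market` and the placeholder preference lists are hypothetical.

```python
import random


def draw_market(single_prefs, couple_prefs, n_interns=496, n_couples=24, rng=random):
    """Draw one simulated market from pooled preference profiles.

    Assumes sampling with replacement and that n_interns counts both
    members of every couple.
    """
    n_singles = n_interns - 2 * n_couples
    singles = [rng.choice(single_prefs) for _ in range(n_singles)]
    couples = [rng.choice(couple_prefs) for _ in range(n_couples)]
    return singles, couples


# Placeholder pools standing in for the real (anonymized) profiles.
pool_singles = [["h1", "h2", "h3"], ["h2", "h1", "h3"], ["h3", "h2", "h1"]]
pool_couples = [["h1", "h3", "h2"], ["h2", "h3", "h1"]]

rng = random.Random(0)
singles, couples = draw_market(pool_singles, pool_couples, rng=rng)
print(len(singles), len(couples))  # 448 24
```

Repeating the draw 5,000 times with independent seeds would produce the kind of market sample described above.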

To create Figures 7 and 8 we drew markets (in the sense explained above),
then used our assignment mechanism to generate the matrix $M$, to which
Algorithm 1 was applied. While Theorem 3 only ensures approximation quality
of $2/q$, we can clearly see in Figure 7 that although in our case $q = 4$, the
average performance of our algorithm is far better than $1/2$ (the red vertical
line). One reason for the difference between the theoretical prediction and
the data is that our analysis assumes that the percentage of singles' weights
is equal across all hospitals (see also the concluding discussion). The second
and much more important reason is that even after applying the probability
allocation mechanism, pathological examples similar to the one presented in
Theorem 2 are relatively rare. A third reason is that the arbitrary way in
which singles were allocated at the end of the algorithm may in fact improve
the performance (see footnote 13). We note that the spike at the left side of
the distribution occurs exactly because of a small hospital to which couples are
likely to be allocated because of their (non-random) preferences (indeed, this
spike disappears completely in Figure 9, which uses random allocation of
probabilities).


Figure 7: The histogram shows the distribution of the maximum value over 
the $L_1$ norm of each intern in the absolute distance between the original
assignment matrix and the approximated one. The red dashed vertical line 
represents the theoretical upper bound. 



For completeness, Figure 8 depicts the distribution of
$\frac{1}{|I|} \left\| \left| M - \sum_{k=1}^{K} \lambda^k M^k \right| \cdot \mathbb{1}_{|H|} \right\|_{1}$,
which is the average $L_1$ norm. This metric was not analyzed theoretically, but
it is of much interest to the social planner. It roughly means that while the
intern whose probability vector changed the most suffered an average
change of about 15%, the average intern's probability vector
changed by less than 2%.


Figure 8: The histogram shows the distribution of the average value over 
the $L_1$ norm of each intern in the absolute distance between the original
assignment matrix and the approximated one. 



In order to separate the effect of the LP mechanism used to generate
probability matrices, we also produce similar figures for matrices produced
randomly using the iterative algorithm of [42] (but only select those within

13 One may consider replacing the arbitrary assignment with a more sophisticated 
method to slightly improve the results. We did not pursue this direction any further. 
our sub-domain of interest, $\mathcal{P}^{S>C}$), with the same market size as
used for Figures 7 and 8. Figure 9 indeed reveals that using random matrices
reduces the maximal impact by a factor of about 5. However, it is evident from
Figure 10 that the average intern's probability vector is now more susceptible
to changes following the application of our approximation algorithm.


Figure 9: The histogram shows the distribution of the maximum value over 
the $L_1$ norm of each intern in the absolute distance between the original
assignment matrix and the approximated one, for randomly generated values. 



Figure 10: The histogram shows the distribution of the average value over 
the $L_1$ norm of each intern in the absolute distance between the original
assignment matrix and the approximated one, for randomly generated values. 





Finally, we also tried to see what would have happened to the performance
of our algorithm if couples' preferences were more capacity-driven compared
to their true distribution. For this test, we used the same distribution for
singles' preferences, but each couple's preference was created by drawing
hospitals one by one, where every time the draw is from those hospitals not
drawn yet, and with the capacity of the hospital being the weight in the
random draw. This made couples like big hospitals better than they like
small hospitals, and thus the minimal weight of singles in all hospitals went
up, and our algorithm's performance improved (see Figures 11 and 12).
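The capacity-weighted draw described above can be sketched as a weighted sampling without replacement. The function name `capacity_driven_ranking` and the toy capacities are assumptions for illustration.

```python
import random
from collections import Counter


def capacity_driven_ranking(capacity, rng=random):
    """Draw hospitals one by one without replacement, weighting each
    remaining hospital by its capacity; the draw order is the ranking."""
    remaining = dict(capacity)
    ranking = []
    while remaining:
        hospitals = list(remaining)
        weights = [remaining[h] for h in hospitals]
        pick = rng.choices(hospitals, weights=weights, k=1)[0]
        ranking.append(pick)
        del remaining[pick]
    return ranking


# Hypothetical capacities.
capacity = {"big": 40, "mid": 20, "small": 4}
rng = random.Random(0)
first_choices = Counter(capacity_driven_ranking(capacity, rng)[0] for _ in range(2000))
print(first_choices.most_common())
```

With these weights the large hospital is drawn first with probability 40/64, so simulated couples rank big hospitals near the top much more often than small ones.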

Figure 11 presents a funny double-peaked distribution. The reason for
this is not lack of experiments, but rather whether or not a couple has a
Figure 11: The histogram shows the distribution of the maximum value over
the $L_1$ norm of each intern in the absolute distance between the original
assignment matrix and the approximated one, for capacity-driven couples. 



Figure 12: The histogram shows the distribution of the average value over 
the $L_1$ norm of each intern in the absolute distance between the original
assignment matrix and the approximated one, for capacity-driven couples. 
chance of getting to a hospital with capacity 4. Since the LP is an affine
maximizer, if a couple has a chance to get to this small hospital, then this
probability will be non-negligible (on the order of at least 0.3 or so). If the
couple has a chance to get there, then rounding the probabilities of the interns
there will give the intern with the highest discrepancy in the decomposition.
On the other hand, if no couple ever gets to that hospital, the decomposition
there is always perfect, and we need to worry about the second smallest
hospital (which already has twice as many interns as the smallest
hospital).


7 Extensions 


7.1 Lower percentage of singles in each hospital 

For the sake of simplicity we focused on the domain $\mathcal{P}^{S>C}$ in which
there are more single interns than interns who are part of a couple in each
hospital. It is easy to generalize our results to the domain
$\mathcal{P}^{S>\alpha C}$, in which in every hospital the total weight
allocated to single interns is at least $\alpha$ times the total weight
allocated to couples
($\forall h \in H: \sum_{s \in S} M_{s,h} \geq \alpha \cdot 2 \sum_{c \in C} M_{c_1,h}$),
for some positive $\alpha$.

The example given in Theorem 2 can be modified to have [...] + 2 singles
and [...] − 1 couples, and the resulting bound will be [...]. The
approximation algorithm stays exactly the same, and it provides an
approximation of [...].


On a similar note, we remark that presenting the approximation as
depending on the minimal capacity was a matter of choice. A slightly more
accurate bound can be formulated in terms of the minimal singles' demand
across hospitals, i.e., $\min_{h \in H} \sum_{s \in S} M_{s,h}$. This bound
works much better when for some reason couples focus their demand on larger
hospitals.


7.2 Groups larger than two 

In the Israeli Medical Internship Match, interns were only allowed to register 
either as singles or as couples. However, other applications may require larger 
groups to be assigned together. To extend our solution to larger groups it 
is possible to replace the first stage of the algorithm (the one which rounds 
up the capacities demanded by the couples and allocates the couples) by the 
allocation method of [34] (without the singles). Their algorithm ensures an 
allocation that deviates from the initial allocation by at most k - 1 seats in 
each hospital, where k is the size of the largest group. We can then take 
the remaining capacities and split them among the singles in a similar 
manner, and the analysis of the upper bound will remain similar (the coefficient, 
however, will change according to k). Obviously, our two extensions can be 
combined. 


8 Conclusion 

In this thesis we described a recent application of knowledge and research in 
market design to the problem of allocating interns to internship positions in 
Israel. We presented a novel technique to perform assignments, and showed 
that it greatly improved the assignment of Israeli medical graduates to 
internships, increasing the number of students who received one of their top 
choices. This method requires the medical students to “trade” probabilities 
to get to different places, and therefore creates a new comparison between 
different hospitals, based on how much they are desired in the trade. We 
presented the results of this comparison, and showed that it makes much 
more sense than the traditional one (namely, ordering the hospitals according 
to the number of students who ranked them first). Seeing that the new 
rating makes sense is evidence that the probabilities in the new lottery are 
traded correctly. 

Our data exhibited several very interesting characteristics that led us, 
for example, to recommend (or at least suggest for consideration) merging 
the two cohorts (of local interns and foreign trained doctors) and 
assigning them using the same mechanism, since there are likely significant gains 
from trade. A naive attempt that we made at combining these two 
populations resulted in an improvement for both populations under the new 
mechanism, and an improvement only for the local interns under RSD (with 
the foreign trained doctors receiving a worse expected rank). However, it 
is possible that the MoH will decide that the DNH principle should apply 
here as well and that the baseline that should be taken is not RSD on the 
merged population, but rather two independent runs of RSD (one for each 
population), with the linear program run on top of them. 

We expect that decomposing stochastic matrices under small 
complementarity constraints will arise in other applications. Consider for example 
a lottery which assigns students to courses. A student needs to care about 
her probability of getting each course, but would also like the guarantee that 
two courses will not overlap. While the algorithm we presented here has a 
worst case approximation ratio of $2/q$ (where $q$ is the minimal capacity in the 
problem), it behaves much better on simulated and on real data. The reason 
for this better behavior is that couples do not necessarily concentrate in the 
small hospitals (and indeed making couples prefer big hospitals improves the 
performance). 

8.1 Open questions 

1. What is a more principled way of balancing strategy-proofness with 
efficiency? We did not pay any extra price for strategy-proofness (in 
addition to the price for the DNH principle). 

2. The students ended up giving no advantages to parents in 2014 and 
2015. How would the algorithm behave given a population which 
also includes parents? Is there a more principled way to cope with 
parents? 

3. For the assignment problem with couples, what is the lower bound for 
approximation within the domain $\mathcal{P}^{s \geq c}$ and with elements of M being 
“small”? 

4. For the assignment problem with couples, our approximation algorithm 
and analysis assumed that couples must be matched to the same 
hospital. Can a similar approximation be found when couples can be in 
different hospitals within the same city? 

A Linear programming technical explanation 

To represent the individual rationality constraint, let m be the number of 
hospitals. We define student i’s happiness prior to the trading stage as 

$$h_i = p_{i,1} m^2 + p_{i,2} (m-1)^2 + p_{i,3} (m-2)^2 + \dots + p_{i,m} 1^2, \qquad (3)$$

i.e., a function that represents students’ strong preference to get to their top 
ranked hospitals. If the probabilities we get after the trading stage are $\tilde{p}_{i,k}$ 
(as defined in Eq. [1]), then we define the happiness following the optimization 
as 

$$\tilde{h}_i = \tilde{p}_{i,1} m^2 + \tilde{p}_{i,2} (m-1)^2 + \tilde{p}_{i,3} (m-2)^2 + \dots + \tilde{p}_{i,m} 1^2. \qquad (4)$$

Now our constraints are $\tilde{h}_i \geq h_i$ for every student i. 

Regarding the target function, as described in the Methods section, we 
want to maximize the total satisfaction of the students after trading. Letting 
n denote the number of students (in 2014, n = 496), our optimization goal 
is going to be 

$$\max \sum_{i=1}^{n} \tilde{h}_i = \max \left( \tilde{h}_1 + \tilde{h}_2 + \dots + \tilde{h}_n \right), \qquad (5)$$

that is, maximizing the sum of happiness for all interns. This target function, 
as well as the definition of happiness in Eq. [3], was chosen following a survey 
filled out by approximately 70 interns and 6th year medical students. We note 
that using similar target functions and making small changes in the weights 
used to define happiness did not have a profound effect on the statistics of 
the assignment. 
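
The quantities in this appendix can be evaluated with a few lines of code. The sketch below is only an illustration under stated assumptions: it computes the squared-rank happiness of a student from a list of probabilities indexed by the student's own ranking, checks the individual rationality constraint, and evaluates the target function. It does not solve the linear program itself, and the function names and the two-student example are hypothetical.

```python
from fractions import Fraction


def happiness(probs_by_rank):
    """Squared-rank happiness: the top choice weighs m^2, the last choice 1^2.

    probs_by_rank[k] is the probability of getting the (k+1)-th ranked hospital.
    """
    m = len(probs_by_rank)
    return sum(p * (m - k) ** 2 for k, p in enumerate(probs_by_rank))


def individually_rational(before, after):
    """True if no student's happiness decreases after trading."""
    return all(happiness(a) >= happiness(b) for b, a in zip(before, after))


def total_happiness(after):
    """Target function: sum of happiness over all students after trading."""
    return sum(happiness(a) for a in after)


# Hypothetical example with m = 2 hospitals of one seat each and two students
# whose first choices differ. Before trading each has a 1/2 chance of each
# hospital; after trading each gets their own first choice with certainty.
half = Fraction(1, 2)
before = [[half, half], [half, half]]
after = [[Fraction(1), Fraction(0)], [Fraction(1), Fraction(0)]]

happiness_before = happiness(before[0])
trade_is_rational = individually_rational(before, after)
objective = total_happiness(after)
```

In this example each student's happiness rises from 5/2 to 4, so the trade satisfies the constraint and raises the target function from 5 to 8.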

B Lower Bound for small probabilities 

Define an example with 

$$H = \{h_1, \dots, h_{2m}\} \cup \{h'_1, \dots, h'_{2m}\},$$

$$S = \{s_1, \dots, s_{2m(2k+1)}\}, \text{ and}$$

$$C = \{c_1, \dots, c_{(2k+1)m}\},$$

where m and k are integers. Consider the target matrix $M \in \mathcal{P}^{s \geq c}$ described 
in Table 3. Under this stochastic assignment matrix each single intern in S 
gets a probability of $\frac{1}{2m}$ to be assigned to each hospital in $\{h'_1, \dots, h'_{2m}\}$, and 
each couple in C gets a probability of $\frac{1}{2m}$ to be assigned to each hospital in 
$\{h_1, \dots, h_{2m}\}$. One can verify that the capacity of every hospital is 2k + 1. 

In each deterministic assignment, the couples occupy at most 4mk slots 
in the hospitals in $\{h_1, \dots, h_{2m}\}$, hence the singletons occupy at least 2m 
slots in $\{h_1, \dots, h_{2m}\}$. This is true for every assignment, hence true for their 
convex combination as well, meaning that some single intern spends at least 
$\frac{2m}{2m(2k+1)} = \frac{1}{2k+1}$ of his probability in the hospitals in $\{h_1, \dots, h_{2m}\}$. Thus, 
the approximation cannot be better than $\frac{1}{2k+1} = \frac{1}{q}$. 
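
The counting argument above can be checked numerically. The following sketch is an illustration with hypothetical function names; it assumes, as the argument does, that every seat in the unprimed hospitals is filled and that couples can only fill an even number of seats in any one hospital.

```python
from fractions import Fraction


def hospital_capacity(m, k):
    """Capacity of an unprimed hospital: 2 * m(2k+1) couple members,
    each with weight 1/(2m) on that hospital."""
    n_couple_members = 2 * m * (2 * k + 1)
    return n_couple_members * Fraction(1, 2 * m)


def min_average_single_weight(m, k):
    """Lower bound on the average probability that a single intern spends
    in the unprimed hospitals, following the counting argument."""
    capacity = 2 * k + 1                      # odd capacity of every hospital
    n_unprimed = 2 * m                        # number of unprimed hospitals
    n_singles = 2 * m * (2 * k + 1)           # number of single interns
    total_slots = n_unprimed * capacity
    # A couple takes two seats, so at most 2k seats per hospital go to couples.
    max_couple_slots = n_unprimed * (2 * k)   # equals 4mk
    min_single_slots = total_slots - max_couple_slots  # equals 2m
    return Fraction(min_single_slots, n_singles)


bound_m3_k2 = min_average_single_weight(3, 2)
```

Since the average over single interns is at least 1/(2k+1), at least one single intern must spend that much probability in hospitals where the target matrix gives them zero.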


Table 3: Target matrix for small probabilities example 

                   h_1      ...   h_{2m}   h'_1     ...   h'_{2m}
s_1                0        ...   0        1/(2m)   ...   1/(2m)
...                0        ...   0        1/(2m)   ...   1/(2m)
s_{2m(2k+1)}       0        ...   0        1/(2m)   ...   1/(2m)
c_{1,1}            1/(2m)   ...   1/(2m)   0        ...   0
c_{1,2}            1/(2m)   ...   1/(2m)   0        ...   0
...                1/(2m)   ...   1/(2m)   0        ...   0
c_{m(2k+1),1}      1/(2m)   ...   1/(2m)   0        ...   0
c_{m(2k+1),2}      1/(2m)   ...   1/(2m)   0        ...   0

